[Title of the Invention] Matching Method in Speech Recognition



[Scope of Claim for Patent]

[Claim 1] A matching method in speech recognition for
recognizing an input speech through pattern matching with an
indicator pattern, comprising providing a plurality of unit
standard patterns for each unit forming said speech, and
matching an input pattern with a pattern obtained by connecting
unit standard patterns selected from said plurality of unit
standard patterns, wherein said unit standard patterns are
selected by preliminarily comparing said input pattern with
said unit standard patterns.
[Detailed Description of the Invention] 

(Industrial Field of Application)

The present invention relates to a matching method in
speech recognition, and more particularly, it relates to a
matching method in speech recognition that improves a
technique for performing recognition by pattern matching.

(Prior Art)

As a method of speech recognition, pattern matching is
widely employed in general: patterns to be recognized are
registered in advance as standard patterns, an unknown pattern
input at recognition time is compared with the standard
patterns, and the most similar one is decided as the result of
recognition.

In such a method employing standard patterns, the burden
of registering standard patterns increases with the number of
categories of objects to be recognized, and the storage
capacity for holding the standard patterns increases at the
same time. In order to solve this problem, methods have been
attempted that treat a speech as a connection of finer units,
prepare standard patterns for the respective units, and connect
these unit standard patterns with each other to form a
standard pattern.

Syllables are typical of such units; in the case of
Japanese, arbitrary speech can be formed by connecting about
100 types of syllables.

In the pattern matching method, it is generally effective
to employ not a single standard pattern but a plurality of
standard patterns for each category, in order to improve the
precision of recognition. In the case of employing the
aforementioned unit standard patterns, however, the number of
ways of connecting the patterns increases remarkably as the
number of unit standard patterns is increased. Considering the
word "Kawasaki", for example, the number of ways of connecting
patterns reaches 5^4 (= 625) when five syllable standard
patterns (unit standard patterns) are simply prepared for each
of the four syllables, resulting in an enormous quantity of
matching processing. While this quantity of processing is
remarkably reduced when the so-called two-stage DP (dynamic
programming) matching method employed in connected speech
recognition is applied, the quantity of processing is still
large. The algorithm of the two-stage DP matching method is
described in detail in "Nikkei Electronics", No. 1, 1983, Vol.
4, pp. 171 to 207.
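
The count in the "Kawasaki" example can be checked by enumerating
every way of choosing one pattern per syllable. The following is
only an illustrative sketch; the syllable spelling and the variable
names are assumptions introduced here, not part of the original
specification.

```python
from itertools import product

# Hypothetical romanized syllables of the example word "Kawasaki".
syllables = ["ka", "wa", "sa", "ki"]
patterns_per_syllable = 5  # five unit standard patterns per syllable

# Each connection picks one pattern index (0..4) for every syllable.
connections = list(product(range(patterns_per_syllable), repeat=len(syllables)))

# One word standard pattern would have to be matched per connection.
print(len(connections))
```

Every element of `connections` corresponds to one candidate word
standard pattern, so the matching cost grows exponentially with the
number of syllables in the word.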

(Problem to be Solved by the Invention) 

In the conventional matching method for this type of
speech recognition, as hereinabove described, the number of
connections increases remarkably with the number of unit
standard patterns, unavoidably resulting in an enormous
quantity of processing in pattern matching.

In order to solve the aforementioned problem, an object
of the present invention is to provide a matching method in
speech recognition that does not increase the quantity of
processing even when a plurality of unit standard patterns
are utilized.

(Structure of the Invention) 

According to the present invention, a matching method
in speech recognition for recognizing an input speech through
pattern matching with an indicator pattern comprises providing
a plurality of unit standard patterns for each unit forming the
speech, and matching an input pattern with a pattern obtained
by connecting unit standard patterns selected from the said
plurality of unit standard patterns, the said unit standard
patterns being selected by preliminarily comparing the input
pattern with the unit standard patterns.
(Function)

The function of the present invention is described with
reference to a case of forming units of standard patterns by
syllables in word recognition.

It is assumed that an input word pattern is expressed
as a series A = {a(j), j = 1, ..., J} of feature vectors
obtained by speech analysis (FFT analysis, LPC analysis or the
like). It is also assumed that the word to be recognized is
expressed as a series of syllable symbols. This is expressed
as S = {s(p), p = 1, ...}.

It is assumed that syllable standard patterns are
registered in advance of recognition and expressed as a series
of feature vectors similarly to the input pattern. This is
expressed as B_mk = {b_mk(i), i = 1, ..., I}. B_mk represents
the k-th standard pattern of a syllable m.

The point is to obtain the degree of similarity between
the input pattern A and the syllable symbol series S. When
the number K(m) of standard patterns of each syllable is 1,
a word standard pattern can be readily formed by connecting
the standard patterns of the syllables in order of the syllable
symbol series.

In this case, it is possible to apply a time axis
normalization matching method employing dynamic programming,
the so-called DP matching method (Journal of the Acoustical
Society of Japan, Vol. 27, No. 9, P. 483), which is widely put
into practice as a typical pattern matching method.
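
To make the idea of time axis normalization concrete, a minimal
sketch of a DP matching computation between an input series A of
length J and a standard pattern B of length I follows. The Euclidean
frame distance, the symmetric three-way recurrence and all function
names are assumptions chosen for illustration; the cited paper may
use different local constraints and weights. A smaller accumulated
distance stands for a higher degree of similarity.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dp_matching_distance(A, B):
    """Accumulated distance along the best monotone time alignment of A and B."""
    J, I = len(A), len(B)
    inf = float("inf")
    # g[j][i]: best accumulated distance aligning A[:j] with B[:i].
    g = [[inf] * (I + 1) for _ in range(J + 1)]
    g[0][0] = 0.0
    for j in range(1, J + 1):
        for i in range(1, I + 1):
            d = frame_distance(A[j - 1], B[i - 1])
            # Stretch A, stretch B, or advance both by one frame.
            g[j][i] = d + min(g[j - 1][i], g[j][i - 1], g[j - 1][i - 1])
    return g[J][I]

# The same pattern spoken more slowly still aligns with zero distance.
slow = [[0.0], [1.0], [1.0], [2.0]]
fast = [[0.0], [1.0], [2.0]]
print(dp_matching_distance(slow, fast))
```

The table has J x I cells, so a single word match costs on the order
of J x I frame comparisons; this is the cost that is multiplied when
many connected word standard patterns must be tried.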

When the number K(m) is at least 2, a plurality of word
standard patterns can be formed in the aforementioned manner,
which increases the quantity of processing of word matching.
If, however, the pattern conceivably most similar to the input
pattern is extracted from the K(m) syllable standard patterns
for each syllable and the extracted patterns are connected
with each other, a single word standard pattern is formed and
the quantity of processing can be remarkably reduced.

Such a method includes so-called syllable segmentation,
in which a section possibly including a certain syllable is
properly sliced from an input speech and the degree of
similarity to each syllable standard pattern is obtained in
this section.

Fig. 2 is a principle diagram of syllable segmentation
in the present invention. Taking as an example the case of
forming a word standard pattern of the word "riku", which has
two syllables, Fig. 2 shows that the degree of similarity
between a section A of an input pattern and a standard pattern
of "ri" and the degree of similarity between a section B and a
standard pattern of "ku" may be obtained respectively. In this
case, a value expressing the inclusion relation between the two
patterns can be employed, such as a value A employed in a
method described in Japanese Patent Application No. 60-153217
(1985), for example.

The sections A and B can be obtained by an arbitrary
method, such as a method utilizing features of amplitude
patterns or a method simply dividing the input section equally.
Referring to Fig. 2, a point X equally dividing the input
section [S, E] into two sections is obtained, and the sections
are given as A = [S, X + e] and B = [X - e, E] with a proper
tolerance limit e.
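
As an illustration of the equal division with a tolerance limit, a
minimal sketch follows. It assumes integer frame indices, takes X as
the integer midpoint, and clamps the overlapping ends to [S, E]; the
function name and the clamping are assumptions introduced here.

```python
def split_two_sections(start, end, tolerance):
    """Return sections A = [S, X + e] and B = [X - e, E] for input section [S, E]."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    x = (start + end) // 2  # point X equally dividing [S, E]
    section_a = (start, min(end, x + tolerance))
    section_b = (max(start, x - tolerance), end)
    return section_a, section_b

# A 100-frame input with a tolerance of 5 frames.
print(split_two_sections(0, 100, 5))  # ((0, 55), (45, 100))
```

The two sections overlap by 2e frames around X, so a syllable
boundary that falls slightly before or after the midpoint is still
covered by both sections.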
[Embodiment]

The present invention is now described in detail with
reference to Fig. 1.

Fig. 1 is a block diagram showing an embodiment of the
present invention. Referring to Fig. 1, the contents of an
input pattern buffer 1 are subjected to pattern matching with
respective words of a word dictionary 5. A syllable
segmentation processing part 2 hypothesizes any word stored
in the word dictionary 5 with reference to the input pattern
in the buffer 1 and obtains sections corresponding to the
respective syllables. A syllable standard pattern selection
part 3 compares the patterns of the respective syllable
sections obtained by the processing part 2 with the respective
syllable standard patterns in a standard pattern buffer 6
(those whose syllable names match the corresponding syllables
in the hypothesized word) and outputs the most similar
syllable standard patterns to a connected standard pattern
buffer 7. The connected standard pattern buffer 7 stores, as
a word standard pattern, a pattern obtained by connecting the
syllable standard patterns decided by the selection part 3.
A word matching part 4 matches the input pattern in the input
pattern buffer 1 with the word standard pattern in the
connected standard pattern buffer 7 and outputs the degree of
similarity.

In the aforementioned embodiment, the segmentation
processing part 2 can be implemented in an arbitrary manner,
for example by simply dividing a word equally according to the
number of syllables, or by simply utilizing specific
information (such as a peak of energy or a silent part). The
standard pattern selection part 3 can also be implemented in
an arbitrary manner so far as it can determine the similarity
between a partial pattern of an input pattern and a standard
pattern; since the advantage of the present invention resides
in improvement of efficiency, an implementation with simple
calculation is preferable. The method described in Japanese
Patent Application No. 60-153217 is an example thereof. The
word matching part 4 can also be implemented in an arbitrary
manner, while it is well known that the DP matching method is
superior in precision of recognition, and this is most
appropriate in this case.
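
The data flow between parts 2, 3 and 4 of Fig. 1 can be sketched as
follows. This is a toy illustration under several assumptions, not
the implementation of the embodiment: feature vectors are
one-dimensional, segmentation is equal division with a tolerance
applied at every boundary, the selection part reuses a DP distance
instead of the simpler inclusion measure of the cited application,
and all names and sample values are hypothetical. Distances are
used, so smaller means more similar.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dp_distance(A, B):
    """DP matching distance (assumed symmetric recurrence)."""
    inf = float("inf")
    g = [[inf] * (len(B) + 1) for _ in range(len(A) + 1)]
    g[0][0] = 0.0
    for j in range(1, len(A) + 1):
        for i in range(1, len(B) + 1):
            g[j][i] = dist(A[j - 1], B[i - 1]) + min(
                g[j - 1][i], g[j][i - 1], g[j - 1][i - 1])
    return g[len(A)][len(B)]

def segment_equally(n_frames, n_syllables, tol):
    """Part 2: equal division into half-open frame ranges widened by tol."""
    return [(max(0, p * n_frames // n_syllables - tol),
             min(n_frames, (p + 1) * n_frames // n_syllables + tol))
            for p in range(n_syllables)]

def score_word(input_pattern, syllables, standard_patterns, tol=1):
    """Parts 3 and 4: select one pattern per syllable, connect, match once."""
    sections = segment_equally(len(input_pattern), len(syllables), tol)
    connected = []  # plays the role of connected standard pattern buffer 7
    for syllable, (s, e) in zip(syllables, sections):
        part = input_pattern[s:e]
        best = min(standard_patterns[syllable],
                   key=lambda B: dp_distance(part, B))
        connected.extend(best)
    return dp_distance(input_pattern, connected)  # single word match

def recognize(input_pattern, word_dictionary, standard_patterns):
    """Hypothesize every dictionary word and return the closest one."""
    return min(word_dictionary,
               key=lambda w: score_word(input_pattern, word_dictionary[w],
                                        standard_patterns))

# Hypothetical data: two standard patterns per syllable (K(m) = 2).
standard_patterns = {
    "ri": [[[1.0], [1.0]], [[2.0], [2.0]]],
    "ku": [[[5.0], [5.0]], [[8.0], [8.0]]],
}
word_dictionary = {"riku": ["ri", "ku"], "kuri": ["ku", "ri"]}
input_pattern = [[2.0], [2.0], [2.0], [8.0], [8.0], [8.0]]
print(recognize(input_pattern, word_dictionary, standard_patterns))
```

For each hypothesized word, the sketch runs one short comparison per
candidate syllable pattern and then exactly one full word match,
rather than one full word match per combination of syllable
patterns.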

(Effect of the Invention) 

According to the present invention, as hereinabove
described, matching can be performed without increasing the
quantity of processing while employing a plurality of unit
standard patterns, whereby a matching method in speech
recognition that remarkably improves both precision of
recognition and efficiency of pattern matching can be
effectively implemented.
[Brief Description of the Drawings]

Fig. 1 is a block diagram showing an embodiment of the
present invention, and Fig. 2 is a principle diagram of syllable
segmentation in the present invention.

1 ... input pattern buffer, 2 ... segmentation processing
part, 3 ... standard pattern selection part, 4 ... word matching
part, 5 ... word dictionary, 6 ... standard pattern buffer,
7 ... connected standard pattern buffer.